Bayesian Nonparametric Collaborative Topic Poisson Factorization for Electronic Health Records-Based Phenotyping

Authors

  • Wonsung Lee
  • Youngmin Lee
  • Heeyoung Kim
  • Il-Chul Moon
Abstract

Phenotyping with electronic health records (EHR) has received much attention in recent years because it opens a new way to discover clinically meaningful insights, such as disease progression and disease subtypes, without human supervision. Despite these potential benefits, the complex nature of EHR data often requires more sophisticated methodologies than traditional methods provide. Previous work on EHR-based phenotyping used unsupervised and supervised learning methods separately, detecting phenotypes and predicting medical risk scores independently. To improve EHR-based phenotyping by bridging these separate methods, we present Bayesian nonparametric collaborative topic Poisson factorization (BN-CTPF), which is the first nonparametric content-based Poisson factorization and the first application to jointly analyze phenotype topics and estimate individual risk scores. BN-CTPF predicts risk scores better than previous matrix factorization and topic modeling methods, including Poisson factorization and its collaborative extensions. BN-CTPF also provides faceted views of the phenotype topics by patients’ demographics. Finally, we demonstrate a scalable stochastic variational inference algorithm by applying BN-CTPF to a national-scale EHR dataset.

Similar resources

Bayesian Nonparametric Poisson Factorization for Recommendation Systems

We develop a Bayesian nonparametric Poisson factorization model for recommendation systems. Poisson factorization implicitly models each user’s limited budget of attention (or money) that allows consumption of only a small subset of the available items. In our Bayesian nonparametric variant, the number of latent components is theoretically unbounded and effectively estimated when computing a po...

Nonparametric Bayesian Matrix Factorization by Power-EP

Many real-world applications can be modeled by matrix factorization. By approximating an observed data matrix as the product of two latent matrices, matrix factorization can reveal hidden structures embedded in data. A common challenge in using matrix factorization is determining the dimensionality of the latent matrices from data. Indian Buffet Processes (IBPs) enable us to apply the nonparametr...

Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example that integrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop ...

Gamma Processes, Stick-Breaking, and Variational Inference

While most Bayesian nonparametric models in machine learning have focused on the Dirichlet process, the beta process, or their variants, the gamma process has recently emerged as a useful nonparametric prior in its own right. Current inference schemes for models involving the gamma process are restricted to MCMC-based methods, which limits their scalability. In this paper, we present a variatio...

Bayesian change point estimation in Poisson-based control charts

Precise identification of the time when a process has changed enables process engineers to search for a potential special cause more effectively. In this paper, we develop change point estimation methods for a Poisson process in a Bayesian framework. We apply Bayesian hierarchical models to formulate the change point where there exists a step change, a linear trend and a known multip...

Publication date: 2016